Based on the NAFlex server and in particular in its Nucleic Acids Analysis section.
This tutorial aims to illustrate the process of extracting structural and dynamical properties from a DNA MD trajectory helical parameters, step by step, using the BioExcel Building Blocks library (biobb). The particular example used is the Drew Dickerson Dodecamer sequence -CGCGAATTCGCG- (PDB code 1BNA). The trajectory used is a 500ns-long MD simulation taken from the BigNASim database (NAFlex_DDD_II entry).
git clone https://github.com/bioexcel/biobb_wf_dna_helparms.git
cd biobb_wf_dna_helparms
conda env create -f conda_env/environment.yml
conda activate biobb_dna_helparms_tutorial
jupyter-nbextension enable --py --user widgetsnbextension
jupyter-notebook biobb_wf_md_setup/notebooks/biobb_dna_helparms_tutorial.ipynb

# Auxiliar libraries
import os
import shutil
import glob
from pathlib import Path, PurePath
import zipfile
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import pandas as pd
from IPython.display import Image
import ipywidgets
Input parameters needed:
seq_comp: Complementary sequence of the given DNA structure (e.g. CGCGAATTCGCG)
traj: Trajectory for a 500ns Drew Dickerson Dodecamer MD simulation (taken from BigNASim)
# Input parameters
seq = "CGCGAATTCGCG"
seq_comp = "CGCGAATTCGCG"
traj = "TRAJ/structure.stripped.nc"
top = "TRAJ/structure.stripped.top"
# Auxiliar lists
grooves = ('majd','majw','mind','minw')
axis_base_pairs = ('inclin','tip','xdisp','ydisp')
base_pair = ('shear','stretch','stagger','buckle','propel','opening')
base_pair_step = ('rise','roll','twist','shift','slide','tilt')
backbone_torsions = ('alphaC', 'alphaW', 'betaC', 'betaW', 'gammaC', 'gammaW', 'deltaC', 'deltaW', \
'epsilC', 'epsilW', 'zetaC', 'zetaW', 'chiC', 'chiW', 'phaseC', 'phaseW')
Curves+ program and its associated Canal tool allow us to extract helical parameters from a DNA MD simulation.
Curves+ is a nucleic acid conformational analysis program which provides both helical and backbone parameters, including a curvilinear axis and parameters relating the position of the bases to this axis. It additionally provides a full analysis of groove widths and depths. Curves+ can also be used to analyse molecular dynamics trajectories. With the help of the accompanying program Canal, it is possible to produce a variety of graphical output including parameter variations along a given structure and time series or histograms of parameter variations during dynamics.
Conformational analysis of nucleic acids revisited: Curves+
R. Lavery, M. Moakher, J. H. Maddocks, D. Petkeviciute, K. Zakrzewska
Nucleic Acids Research, Volume 37, Issue 17, 1 September 2009, Pages 5917–5929
https://doi.org/10.1093/nar/gkp608
CURVES+ web server for analyzing and visualizing the helical, backbone and groove parameters of nucleic acid structure.
R. Lavery, M. Moakher, J.H. Maddocks, D. Petkeviciute, K. Zakrzewska.
Nucleic Acids Res 2009, 37:5917-5929
https://bisi.ibcp.fr/tools/curves_plus/
Building Blocks used:
The extraction of helical parameters is then done in two steps:
Curves+ program needs a trajectory and its associated topology, and a couple of ranges, informing about the numeration of the two DNA strands: s1range and s2range.
from biobb_dna.curvesplus.biobb_curves import curves
curves_out_lis = "curves.out.lis"
curves_out_cda = "curves.out.cda"
prop = {
's1range' : '1:12',
's2range' : '24:13'
}
curves(
input_struc_path=traj,
input_top_path=top,
output_lis_path=curves_out_lis,
output_cda_path=curves_out_cda,
properties=prop
)
2021-09-27 13:18:01,377 [MainThread ] [INFO ] Creating curves_c47f0d81-4fbe-46d4-a81f-6daaf812ce0f temporary folder 2021-09-27 13:18:01,431 [MainThread ] [INFO ] Appending provided topology to command 2021-09-27 13:18:01,432 [MainThread ] [INFO ] Creating command line with instructions and required arguments 2021-09-27 13:18:06,836 [MainThread ] [INFO ] Cur+ <<! &inp file=structure.stripped.nc, ftop=structure.stripped.top, lis='curves_output', lib=/anaconda3/envs/biobb_dna_helparms_tutorial/.curvesplus/standard, ions=.f., test=.f., line=.f., fit=.f., itst=0,itnd=0,itdel=1, &end 2 1 -1 0 0 1:12 24:13 ! 2021-09-27 13:18:06,837 [MainThread ] [INFO ] Exit code 0 2021-09-27 13:18:06,844 [MainThread ] [INFO ] Removed: curves_c47f0d81-4fbe-46d4-a81f-6daaf812ce0f
0
Canal program needs the output of the previous Curves+ execution, and is able to produce time series (series property) and histograms (histo property) for the parameter variations during dynamics.
from biobb_dna.curvesplus.biobb_canal import canal
canal_out = "canal.out.zip"
prop = {
'series' : True,
'histo' : True
}
canal(
input_cda_file=curves_out_cda,
input_lis_file=curves_out_lis,
output_zip_path=canal_out,
properties=prop
)
2021-09-27 13:18:09,862 [MainThread ] [INFO ] using sequence CGCGAATTCGCG from curves.out.lis 2021-09-27 13:18:09,864 [MainThread ] [INFO ] Creating canal_e05cf9e3-ce9c-4880-89c0-c6069d935cf4 temporary folder 2021-09-27 13:18:09,881 [MainThread ] [INFO ] Creating command line with instructions and required arguments 2021-09-27 13:18:14,292 [MainThread ] [INFO ] Canal <<! &inp lis=canal_output, cormin=0.6, lev1=0,lev2=0, itst=0,itnd=0,itdel=1, histo=.t., series=.t., corr=.f., &end curves.out.cda CGCGAATTCGCG ! 2021-09-27 13:18:14,293 [MainThread ] [INFO ] Exit code 0 2021-09-27 13:18:14,392 [MainThread ] [INFO ] Removed: canal_e05cf9e3-ce9c-4880-89c0-c6069d935cf4
0
canal_dir = "canal_out"
if Path(canal_dir).exists(): shutil.rmtree(canal_dir)
os.mkdir(canal_dir)
with zipfile.ZipFile(canal_out, 'r') as zip_ref:
zip_ref.extractall(canal_dir)
Average helical parameter values can be computed from the output of Curves+/Canal execution.
The helical parameters can be divided in 5 main blocks:
Translational (Shift, Slide, Rise) and rotational (Tilt, Roll, Twist) parameters related to a dinucleotide Inter-Base Pair (Base Pair Step).

Building Block used:
from biobb_dna.dna.dna_averages import helparaverages
helpar = 'rise'
input_file_path = "canal_out/canal_output" + "_" + helpar + ".ser"
output_averages_csv_path= helpar+'.averages.csv'
output_averages_jpg_path= helpar+'.averages.jpg'
prop = {
'helpar_name': helpar,
'sequence': seq
}
helparaverages(
input_ser_path=input_file_path,
output_csv_path=output_averages_csv_path,
output_jpg_path=output_averages_jpg_path,
properties=prop)
2021-09-27 13:18:19,851 [MainThread ] [INFO ] Creating averages_3015f27d-29fa-40b4-ab66-00d9b03d4448 temporary folder 2021-09-27 13:18:20,196 [MainThread ] [INFO ] Removed: averages_3015f27d-29fa-40b4-ab66-00d9b03d4448
0
output_averages_csv_path= helpar+'.averages.csv'
df = pd.read_csv(output_averages_csv_path)
df
| Base Pair Step | mean | std | |
|---|---|---|---|
| 0 | GC | 3.470550 | 0.377972 |
| 1 | CG | 2.978038 | 0.454661 |
| 2 | GA | 3.377096 | 0.408705 |
| 3 | AA | 3.363942 | 0.378155 |
| 4 | AT | 3.444138 | 0.355242 |
| 5 | TT | 3.374110 | 0.376290 |
| 6 | TC | 3.370114 | 0.411934 |
| 7 | CG | 2.982740 | 0.454215 |
| 8 | GC | 3.464096 | 0.390428 |
Image(filename=output_averages_jpg_path,width = 600)
from biobb_dna.dna.dna_averages import helparaverages
for helpar in base_pair_step:
input_file_path = "canal_out/canal_output" + "_" + helpar + ".ser"
output_averages_csv_path= helpar+'.averages.csv'
output_averages_jpg_path= helpar+'.averages.jpg'
prop = {
'helpar_name': helpar,
'sequence': seq
}
helparaverages(
input_ser_path=input_file_path,
output_csv_path=output_averages_csv_path,
output_jpg_path=output_averages_jpg_path,
properties=prop)
2021-09-27 13:18:31,207 [MainThread ] [INFO ] Creating averages_2ae0298f-53d5-4abf-a880-50704ca8a4d0 temporary folder 2021-09-27 13:18:31,455 [MainThread ] [INFO ] Removed: averages_2ae0298f-53d5-4abf-a880-50704ca8a4d0 2021-09-27 13:18:31,456 [MainThread ] [INFO ] Creating averages_87a28a84-cece-4b48-871c-c3b2c2025870 temporary folder 2021-09-27 13:18:31,689 [MainThread ] [INFO ] Removed: averages_87a28a84-cece-4b48-871c-c3b2c2025870 2021-09-27 13:18:31,691 [MainThread ] [INFO ] Creating averages_76ac232a-c943-4778-9bde-2d5249a39875 temporary folder 2021-09-27 13:18:31,945 [MainThread ] [INFO ] Removed: averages_76ac232a-c943-4778-9bde-2d5249a39875 2021-09-27 13:18:31,947 [MainThread ] [INFO ] Creating averages_f69a8527-812e-43b0-8bfa-4de70f01e4a5 temporary folder 2021-09-27 13:18:32,183 [MainThread ] [INFO ] Removed: averages_f69a8527-812e-43b0-8bfa-4de70f01e4a5 2021-09-27 13:18:32,184 [MainThread ] [INFO ] Creating averages_3d825e6b-6301-4a63-8145-74643fbfefe5 temporary folder 2021-09-27 13:18:32,416 [MainThread ] [INFO ] Removed: averages_3d825e6b-6301-4a63-8145-74643fbfefe5 2021-09-27 13:18:32,418 [MainThread ] [INFO ] Creating averages_851c475d-a84d-4548-a2d9-21b2e674a2b0 temporary folder 2021-09-27 13:18:32,657 [MainThread ] [INFO ] Removed: averages_851c475d-a84d-4548-a2d9-21b2e674a2b0
for helpar in base_pair_step:
output_averages_csv_path= helpar+'.averages.csv'
df = pd.read_csv(output_averages_csv_path)
print("Helical Parameter: " + helpar)
print(df)
print("---------\n")
Helical Parameter: rise Base Pair Step mean std 0 GC 3.470550 0.377972 1 CG 2.978038 0.454661 2 GA 3.377096 0.408705 3 AA 3.363942 0.378155 4 AT 3.444138 0.355242 5 TT 3.374110 0.376290 6 TC 3.370114 0.411934 7 CG 2.982740 0.454215 8 GC 3.464096 0.390428 --------- Helical Parameter: roll Base Pair Step mean std 0 GC -4.192414 8.331507 1 CG 9.446606 9.363419 2 GA 1.852456 8.644153 3 AA 0.536582 7.624018 4 AT -3.163010 7.237972 5 TT 0.534468 7.656334 6 TC 2.127450 8.682155 7 CG 9.585186 9.365877 8 GC -4.021720 8.647046 --------- Helical Parameter: twist Base Pair Step mean std 0 GC 34.088546 4.816438 1 CG 32.326028 6.989191 2 GA 35.500510 5.637390 3 AA 35.972860 4.838027 4 AT 32.721506 3.618198 5 TT 36.053014 4.974014 6 TC 35.610722 5.545855 7 CG 32.319386 6.922376 8 GC 34.228190 4.863295 --------- Helical Parameter: shift Base Pair Step mean std 0 GC 0.269898 1.002226 1 CG 0.261994 1.120109 2 GA -0.568252 0.924852 3 AA -0.303916 0.701930 4 AT 0.003008 0.560834 5 TT 0.320328 0.700562 6 TC 0.573412 0.934916 7 CG -0.315024 1.130856 8 GC -0.220348 0.982379 --------- Helical Parameter: slide Base Pair Step mean std 0 GC -0.207536 0.492641 1 CG 0.121372 0.560931 2 GA -0.023312 0.648352 3 AA -0.502846 0.563088 4 AT -0.963374 0.396783 5 TT -0.510682 0.556494 6 TC -0.011398 0.648248 7 CG 0.135720 0.561489 8 GC -0.242346 0.509866 --------- Helical Parameter: tilt Base Pair Step mean std 0 GC 0.788582 5.737113 1 CG 2.352174 6.790473 2 GA -2.433898 5.987960 3 AA -2.723502 5.435644 4 AT 0.072902 4.969859 5 TT 2.780530 5.351470 6 TC 2.466076 5.965965 7 CG -2.520270 6.909690 8 GC -0.519486 5.749269 ---------
images = []
for helpar in base_pair_step:
images.append(helpar + '.averages.jpg')
f, axarr = plt.subplots(2, 3, figsize=(40, 20))
for i, image in enumerate(images):
y = i%3
x = int(i/3)
img = mpimg.imread(image)
axarr[x,y].imshow(img, aspect='auto')
axarr[x,y].axis('off')
plt.show()
Translational (Shear, Stretch, Stagger) and rotational (Buckle, Propeller, Opening) parameters related to a dinucleotide Intra-Base Pair.

Building Block used:
from biobb_dna.dna.dna_averages import helparaverages
for helpar in base_pair:
#input_file_path = canal_out + "_" + helpar + ".ser"
input_file_path = "canal_out/canal_output" + "_" + helpar + ".ser"
output_averages_csv_path= helpar+'.averages.csv'
output_averages_jpg_path= helpar+'.averages.jpg'
prop = {
'helpar_name': helpar,
'sequence': seq
}
helparaverages(
input_ser_path=input_file_path,
output_csv_path=output_averages_csv_path,
output_jpg_path=output_averages_jpg_path,
properties=prop)
2021-09-27 13:18:41,989 [MainThread ] [INFO ] Creating averages_dd60957b-66f9-43d5-ae25-ea6bc16e5a56 temporary folder 2021-09-27 13:18:42,253 [MainThread ] [INFO ] Removed: averages_dd60957b-66f9-43d5-ae25-ea6bc16e5a56 2021-09-27 13:18:42,255 [MainThread ] [INFO ] Creating averages_efa78ba5-8a43-416f-813e-6100f4022ea2 temporary folder 2021-09-27 13:18:42,508 [MainThread ] [INFO ] Removed: averages_efa78ba5-8a43-416f-813e-6100f4022ea2 2021-09-27 13:18:42,510 [MainThread ] [INFO ] Creating averages_8cd67bae-99d0-4b47-aa53-87e0e90d7487 temporary folder 2021-09-27 13:18:42,753 [MainThread ] [INFO ] Removed: averages_8cd67bae-99d0-4b47-aa53-87e0e90d7487 2021-09-27 13:18:42,755 [MainThread ] [INFO ] Creating averages_1186e7b3-b368-4e46-9b69-df5f6cbb5de0 temporary folder 2021-09-27 13:18:43,219 [MainThread ] [INFO ] Removed: averages_1186e7b3-b368-4e46-9b69-df5f6cbb5de0 2021-09-27 13:18:43,223 [MainThread ] [INFO ] Creating averages_57be9d4a-8f39-4b82-bb82-6ad827abe8f4 temporary folder 2021-09-27 13:18:43,540 [MainThread ] [INFO ] Removed: averages_57be9d4a-8f39-4b82-bb82-6ad827abe8f4 2021-09-27 13:18:43,542 [MainThread ] [INFO ] Creating averages_77edf568-e194-4997-b126-03ef58e03140 temporary folder 2021-09-27 13:18:43,817 [MainThread ] [INFO ] Removed: averages_77edf568-e194-4997-b126-03ef58e03140
for helpar in base_pair:
output_averages_csv_path= helpar+'.averages.csv'
df = pd.read_csv(output_averages_csv_path)
print("Helical Parameter: " + helpar)
print(df)
print("---------\n")
Helical Parameter: shear Base Pair mean std 0 G -0.049992 0.437419 1 C -0.066120 0.428450 2 G 0.033414 0.438027 3 A 0.192408 0.393915 4 A 0.181268 0.382112 5 T -0.187710 0.394140 6 T -0.209416 0.391508 7 C -0.021976 0.444251 8 G 0.061500 0.432066 9 C 0.033314 0.430674 --------- Helical Parameter: stretch Base Pair mean std 0 G 0.021910 0.154373 1 C 0.057128 0.146271 2 G 0.089684 0.162121 3 A 0.073182 0.155722 4 A 0.043370 0.154155 5 T 0.046586 0.151565 6 T 0.072776 0.159554 7 C 0.088406 0.164430 8 G 0.060682 0.149506 9 C 0.026160 0.148136 --------- Helical Parameter: stagger Base Pair mean std 0 G 0.270900 0.531987 1 C 0.190716 0.524902 2 G -0.030574 0.514709 3 A 0.094458 0.515384 4 A 0.123512 0.511518 5 T 0.111454 0.501179 6 T 0.094714 0.513247 7 C -0.052344 0.516656 8 G 0.187962 0.520935 9 C 0.245116 0.527218 --------- Helical Parameter: buckle Base Pair mean std 0 G -2.011124 10.957743 1 C -2.108248 10.706547 2 G 7.351526 11.411534 3 A 6.094836 11.016860 4 A 2.021262 9.824963 5 T -1.932570 9.869768 6 T -6.454566 10.888373 7 C -7.332888 11.269869 8 G 1.551400 10.722523 9 C 1.740906 10.696502 --------- Helical Parameter: propel Base Pair mean std 0 G -8.589338 13.487527 1 C -3.200962 12.673929 2 G -4.877606 12.107475 3 A -15.895916 11.463353 4 A -17.740540 11.053182 5 T -17.600944 10.901201 6 T -15.934650 11.522282 7 C -3.809752 12.151864 8 G -3.145238 12.632066 9 C -8.300600 13.082747 --------- Helical Parameter: opening Base Pair mean std 0 G 0.519980 4.990688 1 C 0.727574 4.801118 2 G 2.788520 5.011118 3 A 2.696966 6.306307 4 A 2.434002 5.929345 5 T 2.367904 5.751322 6 T 2.837478 6.282682 7 C 2.813492 5.139189 8 G 0.822494 4.850359 9 C 0.483850 4.838646 ---------
images = []
for helpar in base_pair:
images.append(helpar + '.averages.jpg')
f, axarr = plt.subplots(2, 3, figsize=(40, 20))
for i, image in enumerate(images):
y = i%3
x = int(i/3)
img = mpimg.imread(image)
axarr[x,y].imshow(img, aspect='auto')
axarr[x,y].axis('off')
plt.show()
Translational (x/y-displacement) and rotational (inclination, tip) parameters related to a dinucleotide Base Pair.

Building Block used:
from biobb_dna.dna.dna_averages import helparaverages
for helpar in axis_base_pairs:
#input_file_path = canal_out + "_" + helpar + ".ser"
input_file_path = "canal_out/canal_output" + "_" + helpar + ".ser"
output_averages_csv_path= helpar+'.averages.csv'
output_averages_jpg_path= helpar+'.averages.jpg'
prop = {
'helpar_name': helpar,
'sequence': seq,
# 'seqpos': [4,3]
}
helparaverages(
input_ser_path=input_file_path,
output_csv_path=output_averages_csv_path,
output_jpg_path=output_averages_jpg_path,
properties=prop)
2021-09-27 13:18:52,922 [MainThread ] [INFO ] Creating averages_53d44001-d9d6-4bbe-8199-5bbbce4d949d temporary folder 2021-09-27 13:18:53,177 [MainThread ] [INFO ] Removed: averages_53d44001-d9d6-4bbe-8199-5bbbce4d949d 2021-09-27 13:18:53,180 [MainThread ] [INFO ] Creating averages_adaade6a-76d9-4c19-969e-9840ed602f25 temporary folder 2021-09-27 13:18:53,422 [MainThread ] [INFO ] Removed: averages_adaade6a-76d9-4c19-969e-9840ed602f25 2021-09-27 13:18:53,425 [MainThread ] [INFO ] Creating averages_17e6dcb0-1b50-4886-a2b2-9586f7235c45 temporary folder 2021-09-27 13:18:53,696 [MainThread ] [INFO ] Removed: averages_17e6dcb0-1b50-4886-a2b2-9586f7235c45 2021-09-27 13:18:53,699 [MainThread ] [INFO ] Creating averages_6daa0abf-4456-4730-b0ed-fcc1bead7ba6 temporary folder 2021-09-27 13:18:53,951 [MainThread ] [INFO ] Removed: averages_6daa0abf-4456-4730-b0ed-fcc1bead7ba6
for helpar in axis_base_pairs:
output_averages_csv_path= helpar+'.averages.csv'
df = pd.read_csv(output_averages_csv_path)
print("Helical Parameter: " + helpar)
print(df)
print("---------\n")
Helical Parameter: inclin Base Pair mean std 0 G 6.041284 6.230831 1 C 5.410818 5.944490 2 G 5.825528 5.359376 3 A 2.924440 5.031021 4 A 0.346356 4.709278 5 T 0.460490 4.744693 6 T 3.120780 5.069334 7 C 6.106598 5.504673 8 G 5.647446 6.043921 9 C 6.601924 6.358550 --------- Helical Parameter: tip Base Pair mean std 0 G 2.512608 5.728359 1 C -5.681794 6.177066 2 G 0.907828 6.084601 3 A 0.569286 5.347552 4 A 1.306582 4.845978 5 T -1.227768 5.019692 6 T -0.568284 5.338846 7 C -0.767708 6.048420 8 G 5.821194 6.102313 9 C -2.578986 10.627202 --------- Helical Parameter: xdisp Base Pair mean std 0 G -0.659664 0.857688 1 C -0.487874 0.832502 2 G -0.421234 0.867164 3 A -0.937412 0.731117 4 A -1.150654 0.600304 5 T -1.149698 0.604352 6 T -0.933022 0.736415 7 C -0.428964 0.875973 8 G -0.560260 0.827803 9 C -0.704972 0.883845 --------- Helical Parameter: ydisp Base Pair mean std 0 G 0.031522 0.563973 1 C -0.185588 0.531173 2 G -0.029302 0.526326 3 A 0.089948 0.507185 4 A 0.132712 0.462237 5 T -0.133432 0.454954 6 T -0.113044 0.506105 7 C 0.002278 0.526475 8 G 0.176116 0.529698 9 C -0.071754 0.554503 ---------
images = []
for helpar in axis_base_pairs:
images.append(helpar + '.averages.jpg')
f, axarr = plt.subplots(2, 2, figsize=(30, 20))
for i, image in enumerate(images):
y = i%2
x = int(i/2)
img = mpimg.imread(image)
axarr[x,y].imshow(img, aspect='auto')
axarr[x,y].axis('off')
plt.show()
Nucleic Acid Structure's strand backbones appear closer together on one side of the helix than on the other. This creates a Major groove (where backbones are far apart) and a Minor groove (where backbones are close together). Depth and width of these grooves can be mesured giving information about the different conformations that the nucleic acid structure can achieve.

Building Block used:
from biobb_dna.dna.dna_averages import helparaverages
for helpar in grooves:
#input_file_path = canal_out + "_" + helpar + ".ser"
input_file_path = "canal_out/canal_output" + "_" + helpar + ".ser"
output_averages_csv_path= helpar+'.averages.csv'
output_averages_jpg_path= helpar+'.averages.jpg'
prop = {
'helpar_name': helpar,
'sequence': seq
}
helparaverages(
input_ser_path=input_file_path,
output_csv_path=output_averages_csv_path,
output_jpg_path=output_averages_jpg_path,
properties=prop)
2021-09-27 13:19:03,134 [MainThread ] [INFO ] Creating averages_c15b1ea4-786c-48ae-9446-a1eff4c9c894 temporary folder 2021-09-27 13:19:03,386 [MainThread ] [INFO ] Removed: averages_c15b1ea4-786c-48ae-9446-a1eff4c9c894 2021-09-27 13:19:03,388 [MainThread ] [INFO ] Creating averages_14e91c7b-6423-4e8e-80c5-06dd45ad81f9 temporary folder 2021-09-27 13:19:03,624 [MainThread ] [INFO ] Removed: averages_14e91c7b-6423-4e8e-80c5-06dd45ad81f9 2021-09-27 13:19:03,627 [MainThread ] [INFO ] Creating averages_180ad0cc-7645-43ff-a425-8ceecb3ba107 temporary folder 2021-09-27 13:19:03,927 [MainThread ] [INFO ] Removed: averages_180ad0cc-7645-43ff-a425-8ceecb3ba107 2021-09-27 13:19:03,931 [MainThread ] [INFO ] Creating averages_268da860-8db6-42ca-b6f1-423ff30307d5 temporary folder 2021-09-27 13:19:04,223 [MainThread ] [INFO ] Removed: averages_268da860-8db6-42ca-b6f1-423ff30307d5
for helpar in grooves:
output_averages_csv_path= helpar+'.averages.csv'
df = pd.read_csv(output_averages_csv_path)
print("Helical Parameter: " + helpar)
print(df)
print("---------\n")
Helical Parameter: majd Base Pair Step mean std 0 GC NaN NaN 1 CG -0.703077 1.362241 2 GA 3.396398 1.402178 3 AA 5.710123 1.406443 4 AT 5.156143 1.346752 5 TT 5.139702 1.337523 6 TC 5.735556 1.447770 7 CG 3.371511 1.402279 8 GC -0.644086 1.444480 --------- Helical Parameter: majw Base Pair Step mean std 0 GC NaN NaN 1 CG 13.040615 1.300370 2 GA 12.664800 1.419105 3 AA 11.986143 1.670021 4 AT 11.476450 1.797558 5 TT 11.472435 1.740485 6 TC 11.964836 1.648435 7 CG 12.508447 1.419017 8 GC 12.819785 1.347786 --------- Helical Parameter: mind Base Pair Step mean std 0 GC NaN NaN 1 CG 3.932224 1.148757 2 GA 5.100660 1.114656 3 AA 4.926210 0.677415 4 AT 4.968002 0.453657 5 TT 4.966188 0.452291 6 TC 4.912856 0.690885 7 CG 5.121034 1.132328 8 GC 3.969312 1.021299 --------- Helical Parameter: minw Base Pair Step mean std 0 GC NaN NaN 1 CG 7.713347 1.175525 2 GA 6.789962 1.405118 3 AA 5.296306 1.332686 4 AT 4.160776 1.113765 5 TT 4.109198 1.119448 6 TC 5.376128 1.314328 7 CG 6.827374 1.402630 8 GC 7.729192 1.148585 ---------
images = []
for helpar in grooves:
images.append(helpar + '.averages.jpg')
f, axarr = plt.subplots(2, 2, figsize=(30, 20))
for i, image in enumerate(images):
y = i%2
x = int(i/2)
img = mpimg.imread(image)
axarr[x,y].imshow(img, aspect='auto')
axarr[x,y].axis('off')
plt.show()
The three major elements of flexibility in the backbone are:
Sugar Puckering annotation is done by dividing the pseudo-rotational circle in four equivalent sections:
These four conformations are those dominating sugar conformational space, in agreement with all available experimental data.
Rotations around α/γ torsions generate non-canonical local conformations leading to a reduced twist and they have been reported as being important in the formation of several protein-DNA complexes.
The concerted rotation around ζ/ε torsions generates two major conformers: BI and BII, which are experimentally known to co-exist in a ratio around 80%:20% (BI:BII) in B-DNA.
|
|
|
|
Sugar Puckering |
Canonical Alpha/Gamma |
BI/BII population |
Building Blocks used:
from biobb_dna.backbone.puckering import puckering
canal_phaseC = "canal_out/canal_output_phaseC.ser"
canal_phaseW = "canal_out/canal_output_phaseW.ser"
output_puckering_csv_path = 'puckering.averages.csv'
output_puckering_jpg_path = 'puckering.averages.jpg'
prop = {
'sequence': seq
}
puckering(
input_phaseC_path=canal_phaseC,
input_phaseW_path=canal_phaseW,
output_csv_path=output_puckering_csv_path,
output_jpg_path=output_puckering_jpg_path,
properties=prop)
2021-09-27 13:19:15,753 [MainThread ] [INFO ] Creating backbone_5244e5f8-6358-4c7a-88a9-04575e6b5ad2 temporary folder 2021-09-27 13:19:16,188 [MainThread ] [INFO ] Removed: backbone_5244e5f8-6358-4c7a-88a9-04575e6b5ad2
0
df = pd.read_csv(output_puckering_csv_path)
df
| Nucleotide | North | East | West | South | |
|---|---|---|---|---|---|
| 0 | C5'-1 | 1.82 | 8.84 | 0.02 | 89.32 |
| 1 | G-2 | 0.10 | 5.92 | 0.02 | 93.96 |
| 2 | C-3 | 3.36 | 20.02 | 0.00 | 76.60 |
| 3 | G-4 | 0.76 | 7.52 | 0.00 | 91.72 |
| 4 | A-5 | 4.16 | 6.72 | 0.00 | 89.10 |
| 5 | A-6 | 3.04 | 16.44 | 0.00 | 80.50 |
| 6 | T-7 | 1.16 | 34.06 | 0.00 | 64.76 |
| 7 | T-8 | 1.50 | 30.36 | 0.00 | 68.10 |
| 8 | C-9 | 4.00 | 24.56 | 0.00 | 71.44 |
| 9 | G-10 | 0.60 | 6.40 | 0.02 | 92.98 |
| 10 | C-11 | 1.72 | 13.38 | 0.00 | 84.90 |
| 11 | G3'-12 | 1.18 | 17.28 | 0.04 | 81.50 |
| 12 | - | 0.00 | 0.00 | 0.00 | 0.00 |
| 13 | G5'-12 | 3.44 | 12.72 | 0.02 | 83.80 |
| 14 | C-11 | 0.16 | 5.56 | 0.00 | 94.24 |
| 15 | G-10 | 3.14 | 20.70 | 0.00 | 76.14 |
| 16 | C-9 | 0.70 | 7.32 | 0.00 | 91.98 |
| 17 | T-8 | 3.64 | 7.04 | 0.00 | 89.32 |
| 18 | T-7 | 3.02 | 14.58 | 0.02 | 82.38 |
| 19 | A-6 | 0.96 | 33.82 | 0.00 | 65.22 |
| 20 | A-5 | 1.00 | 28.42 | 0.00 | 70.58 |
| 21 | G-4 | 4.00 | 23.96 | 0.02 | 71.98 |
| 22 | C-3 | 0.24 | 7.14 | 0.02 | 92.60 |
| 23 | G-2 | 1.72 | 12.86 | 0.00 | 85.40 |
| 24 | C3'-1 | 1.40 | 16.46 | 0.00 | 82.12 |
Image(filename=output_puckering_jpg_path,width = 600)
from biobb_dna.backbone.canonicalag import canonicalag
canal_alphaC = "canal_out/canal_output_alphaC.ser"
canal_alphaW = "canal_out/canal_output_alphaW.ser"
canal_gammaC = "canal_out/canal_output_gammaC.ser"
canal_gammaW = "canal_out/canal_output_gammaW.ser"
output_alphagamma_csv_path = 'alphagamma.averages.csv'
output_alphagamma_jpg_path = 'alphagamma.averages.jpg'
prop = {
'sequence': seq
}
canonicalag(
input_alphaC_path=canal_alphaC,
input_alphaW_path=canal_alphaW,
input_gammaC_path=canal_gammaC,
input_gammaW_path=canal_gammaW,
output_csv_path=output_alphagamma_csv_path,
output_jpg_path=output_alphagamma_jpg_path,
properties=prop)
2021-09-27 13:19:21,553 [MainThread ] [INFO ] Creating backbone_e9bb4f27-b57a-4103-aed9-ca2a8846d895 temporary folder 2021-09-27 13:19:21,956 [MainThread ] [INFO ] Removed: backbone_e9bb4f27-b57a-4103-aed9-ca2a8846d895
0
df = pd.read_csv(output_alphagamma_csv_path)
df
| Nucleotide | Canonical alpha/gamma | |
|---|---|---|
| 0 | C5'-1 | 0.00 |
| 1 | G-2 | 96.76 |
| 2 | C-3 | 89.66 |
| 3 | G-4 | 92.70 |
| 4 | A-5 | 98.30 |
| 5 | A-6 | 94.52 |
| 6 | T-7 | 99.56 |
| 7 | T-8 | 97.86 |
| 8 | C-9 | 99.54 |
| 9 | G-10 | 99.64 |
| 10 | C-11 | 99.72 |
| 11 | G3'-12 | 98.24 |
| 12 | - | 0.00 |
| 13 | G5'-12 | 0.00 |
| 14 | C-11 | 95.42 |
| 15 | G-10 | 93.56 |
| 16 | C-9 | 95.40 |
| 17 | T-8 | 99.38 |
| 18 | T-7 | 95.36 |
| 19 | A-6 | 98.28 |
| 20 | A-5 | 96.70 |
| 21 | G-4 | 93.86 |
| 22 | C-3 | 98.70 |
| 23 | G-2 | 100.00 |
| 24 | C3'-1 | 99.78 |
Image(filename=output_alphagamma_jpg_path,width = 600)
from biobb_dna.backbone.bipopulations import bipopulations
canal_epsilC = "canal_out/canal_output_epsilC.ser"
canal_epsilW = "canal_out/canal_output_epsilW.ser"
canal_zetaC = "canal_out/canal_output_zetaC.ser"
canal_zetaW = "canal_out/canal_output_zetaW.ser"
output_bIbII_csv_path = 'bIbII.averages.csv'
output_bIbII_jpg_path = 'bIbII.averages.jpg'
prop = {
'sequence': seq
}
bipopulations(
input_epsilC_path=canal_epsilC,
input_epsilW_path=canal_epsilW,
input_zetaC_path=canal_zetaC,
input_zetaW_path=canal_zetaW,
output_csv_path=output_bIbII_csv_path,
output_jpg_path=output_bIbII_jpg_path,
properties=prop)
2021-09-27 13:19:27,213 [MainThread ] [INFO ] Creating backbone_78744083-e4a8-4939-97db-4df52d89dce1 temporary folder 2021-09-27 13:19:27,609 [MainThread ] [INFO ] Removed: backbone_78744083-e4a8-4939-97db-4df52d89dce1
0
df = pd.read_csv(output_bIbII_csv_path)
df
| Nucleotide | BI population | BII population | |
|---|---|---|---|
| 0 | C5'-1 | 83.523295 | 16.476705 |
| 1 | G-2 | 74.085183 | 25.914817 |
| 2 | C-3 | 85.982803 | 14.017197 |
| 3 | G-4 | 75.924815 | 24.075185 |
| 4 | A-5 | 67.846431 | 32.153569 |
| 5 | A-6 | 59.588082 | 40.411918 |
| 6 | T-7 | 65.406919 | 34.593081 |
| 7 | T-8 | 75.524895 | 24.475105 |
| 8 | C-9 | 79.504099 | 20.495901 |
| 9 | G-10 | 77.644471 | 22.355529 |
| 10 | C-11 | 82.003599 | 17.996401 |
| 11 | G3'-12 | 0.000000 | 100.000000 |
| 12 | - | 0.000000 | 100.000000 |
| 13 | G5'-12 | 82.443511 | 17.556489 |
| 14 | C-11 | 73.305339 | 26.694661 |
| 15 | G-10 | 84.483103 | 15.516897 |
| 16 | C-9 | 74.305139 | 25.694861 |
| 17 | T-8 | 67.606479 | 32.393521 |
| 18 | T-7 | 59.568086 | 40.431914 |
| 19 | A-6 | 67.126575 | 32.873425 |
| 20 | A-5 | 76.984603 | 23.015397 |
| 21 | G-4 | 80.243951 | 19.756049 |
| 22 | C-3 | 77.604479 | 22.395521 |
| 23 | G-2 | 81.443711 | 18.556289 |
| 24 | C3'-1 | 0.000000 | 100.000000 |
Image(filename=output_bIbII_jpg_path,width = 600)
Time series values for the set of helical parameters can be also extracted from the output of Curves+/Canal execution on Molecular Dynamics Trajectories. The helical parameters can be divided in the same 5 main blocks previously introduced:
Building Block used:
Time series values can be extracted from any of the helical parameters previously introduced. To illustrate the steps needed, the base-pair step helical parameter Twist has been selected. Please note that computing the time series values for a different helical parameter just requires modifying the helpar variable from the next cell.
from biobb_dna.dna.dna_timeseries import helpartimeseries
# Modify the next variable to extract time series values for a different helical parameter
# Possible values are:
# Base Pair Step (Inter Base Pair) Helical Parameters: shift, slide, rise, tilt, roll, twist
# Base Pair (Intra Base Pair) Helical Parameters: shear, stretch, stagger, buckle, propeller, opening
# Axis Parameters: inclin, tip, xdisp, ydisp
# Backbone Torsions Parameters: alphaC, alphaW, betaC, betaW, gammaC, gammaW, deltaC, deltaW,
# epsilC, epsilW, zetaC, zetaW, chiC, chiW, phaseC, phaseW
# Grooves: mind, minw, majd, majw
helpar = "twist" # Modify this variable to extract time series values for a different helical parameter
input_file_path = "canal_out/canal_output" + "_" + helpar + ".ser"
output_timeseries_file_path = helpar + '.timeseries.zip'
prop = {
'helpar_name': helpar,
'sequence': seq
}
helpartimeseries(
input_ser_path=input_file_path,
output_zip_path=output_timeseries_file_path,
properties=prop)
2021-09-27 13:19:34,048 [MainThread ] [INFO ] Creating timeseries_9e791bdb-4d09-4bfe-bbcf-5132cfdb931a temporary folder 2021-09-27 13:19:38,699 [MainThread ] [INFO ] Removed: timeseries_9e791bdb-4d09-4bfe-bbcf-5132cfdb931a
0
timeseries_dir = "timeseries"
if Path(timeseries_dir).exists(): shutil.rmtree(timeseries_dir)
os.mkdir(timeseries_dir)
with zipfile.ZipFile(output_timeseries_file_path, 'r') as zip_ref:
zip_ref.extractall(timeseries_dir)
Discover all the possible nucleotide / base / base-pair / base-pair steps from the sequence. The unit will depend on the helical parameter being studied.
Select one of them to study the time series values of the helical parameter along the simulation.
helpartimesfiles = glob.glob(timeseries_dir + "/*series*.csv")
helpartimes = []
for file in helpartimesfiles:
new_string = file.replace(timeseries_dir + "/series_" + helpar + "_", "")
new_string = new_string.replace(".csv" , "")
helpartimes.append(new_string)
timesel = ipywidgets.Dropdown(
options=helpartimes,
description='Sel. BPS:',
disabled=False,
)
display(timesel)
file_ser = timeseries_dir + "/series_" + helpar + "_" + timesel.value + ".csv"
df = pd.read_csv(file_ser)
df
| Unnamed: 0 | 6_TT | |
|---|---|---|
| 0 | 0 | 24.83 |
| 1 | 1 | 36.78 |
| 2 | 2 | 41.00 |
| 3 | 3 | 33.50 |
| 4 | 4 | 29.68 |
| ... | ... | ... |
| 4995 | 4995 | 42.40 |
| 4996 | 4996 | 44.61 |
| 4997 | 4997 | 39.61 |
| 4998 | 4998 | 36.93 |
| 4999 | 4999 | 36.63 |
5000 rows × 2 columns
file_hist = timeseries_dir + "/hist_" + helpar + "_" + timesel.value + ".csv"
df = pd.read_csv(file_hist)
df
| twist | density | |
|---|---|---|
| 0 | 14.410000 | 1.0 |
| 1 | 15.203617 | 0.0 |
| 2 | 15.997234 | 2.0 |
| 3 | 16.790851 | 1.0 |
| 4 | 17.584468 | 1.0 |
| 5 | 18.378085 | 1.0 |
| 6 | 19.171702 | 5.0 |
| 7 | 19.965319 | 3.0 |
| 8 | 20.758936 | 5.0 |
| 9 | 21.552553 | 11.0 |
| 10 | 22.346170 | 10.0 |
| 11 | 23.139787 | 21.0 |
| 12 | 23.933404 | 20.0 |
| 13 | 24.727021 | 34.0 |
| 14 | 25.520638 | 43.0 |
| 15 | 26.314255 | 61.0 |
| 16 | 27.107872 | 80.0 |
| 17 | 27.901489 | 91.0 |
| 18 | 28.695106 | 114.0 |
| 19 | 29.488723 | 138.0 |
| 20 | 30.282340 | 176.0 |
| 21 | 31.075957 | 181.0 |
| 22 | 31.869574 | 208.0 |
| 23 | 32.663191 | 248.0 |
| 24 | 33.456809 | 265.0 |
| 25 | 34.250426 | 280.0 |
| 26 | 35.044043 | 300.0 |
| 27 | 35.837660 | 327.0 |
| 28 | 36.631277 | 321.0 |
| 29 | 37.424894 | 304.0 |
| 30 | 38.218511 | 270.0 |
| 31 | 39.012128 | 298.0 |
| 32 | 39.805745 | 243.0 |
| 33 | 40.599362 | 234.0 |
| 34 | 41.392979 | 196.0 |
| 35 | 42.186596 | 169.0 |
| 36 | 42.980213 | 115.0 |
| 37 | 43.773830 | 78.0 |
| 38 | 44.567447 | 47.0 |
| 39 | 45.361064 | 39.0 |
| 40 | 46.154681 | 18.0 |
| 41 | 46.948298 | 17.0 |
| 42 | 47.741915 | 10.0 |
| 43 | 48.535532 | 9.0 |
| 44 | 49.329149 | 2.0 |
| 45 | 50.122766 | 1.0 |
| 46 | 50.916383 | 2.0 |
file_ser = timeseries_dir + "/series_" + helpar + "_" + timesel.value + ".jpg"
file_hist = timeseries_dir + "/hist_" + helpar + "_" + timesel.value + ".jpg"
images = []
images.append(file_ser)
images.append(file_hist)
f, axarr = plt.subplots(1, 2, figsize=(50, 15))
for i, image in enumerate(images):
img = mpimg.imread(image)
axarr[i].imshow(img, aspect='auto')
axarr[i].axis('off')
plt.show()
from biobb_dna.dna.dna_timeseries import helpartimeseries
output_timeseries_bps_file_paths = {}
for helpar in base_pair_step:
input_file_path = "canal_out/canal_output" + "_" + helpar + ".ser"
output_timeseries_bps_file_paths[helpar] = helpar + '.timeseries.zip'
prop = {
'helpar_name': helpar,
'sequence': seq
}
helpartimeseries(
input_ser_path=input_file_path,
output_zip_path=output_timeseries_bps_file_paths[helpar],
properties=prop)
2021-09-27 13:20:00,411 [MainThread ] [INFO ] Creating timeseries_9a4296b2-bd8b-4883-bb28-2da55b0b0540 temporary folder 2021-09-27 13:20:04,847 [MainThread ] [INFO ] Removed: timeseries_9a4296b2-bd8b-4883-bb28-2da55b0b0540 2021-09-27 13:20:04,851 [MainThread ] [INFO ] Creating timeseries_0b4e315a-4be8-4fcb-afd0-f7eace4cf833 temporary folder 2021-09-27 13:20:09,372 [MainThread ] [INFO ] Removed: timeseries_0b4e315a-4be8-4fcb-afd0-f7eace4cf833 2021-09-27 13:20:09,376 [MainThread ] [INFO ] Creating timeseries_32b77dac-c2aa-4c42-9efa-5e0e37d01b89 temporary folder 2021-09-27 13:20:13,811 [MainThread ] [INFO ] Removed: timeseries_32b77dac-c2aa-4c42-9efa-5e0e37d01b89 2021-09-27 13:20:13,814 [MainThread ] [INFO ] Creating timeseries_90682e40-3fba-4697-99eb-01cb9ef1c90d temporary folder 2021-09-27 13:20:18,328 [MainThread ] [INFO ] Removed: timeseries_90682e40-3fba-4697-99eb-01cb9ef1c90d 2021-09-27 13:20:18,336 [MainThread ] [INFO ] Creating timeseries_1e464da3-07d5-44c5-805c-fbbaf40276d0 temporary folder 2021-09-27 13:20:22,819 [MainThread ] [INFO ] Removed: timeseries_1e464da3-07d5-44c5-805c-fbbaf40276d0 2021-09-27 13:20:22,823 [MainThread ] [INFO ] Creating timeseries_fff39926-2ef6-4857-924a-8c93d1eab161 temporary folder 2021-09-27 13:20:27,054 [MainThread ] [INFO ] Removed: timeseries_fff39926-2ef6-4857-924a-8c93d1eab161
#if Path(timeseries_dir).exists(): shutil.rmtree(timeseries_dir)
#os.mkdir(timeseries_dir)
for timeseries_zipfile in output_timeseries_bps_file_paths.values():
with zipfile.ZipFile(timeseries_zipfile, 'r') as zip_ref:
zip_ref.extractall(timeseries_dir)
from biobb_dna.dna.dna_timeseries import helpartimeseries
output_timeseries_bp_file_paths = {}
for helpar in base_pair:
input_file_path = "canal_out/canal_output" + "_" + helpar + ".ser"
output_timeseries_bp_file_paths[helpar] = helpar + '.timeseries.zip'
prop = {
'helpar_name': helpar,
'sequence': seq
}
helpartimeseries(
input_ser_path=input_file_path,
output_zip_path=output_timeseries_bp_file_paths[helpar],
properties=prop)
2021-09-27 13:20:30,659 [MainThread ] [INFO ] Creating timeseries_97e9ea00-d0ad-4149-9694-233bed0eb7c5 temporary folder 2021-09-27 13:20:35,825 [MainThread ] [INFO ] Removed: timeseries_97e9ea00-d0ad-4149-9694-233bed0eb7c5 2021-09-27 13:20:35,830 [MainThread ] [INFO ] Creating timeseries_15144a5c-bd5b-435c-9ad9-397d7de7c4ed temporary folder 2021-09-27 13:20:40,221 [MainThread ] [INFO ] Removed: timeseries_15144a5c-bd5b-435c-9ad9-397d7de7c4ed 2021-09-27 13:20:40,225 [MainThread ] [INFO ] Creating timeseries_e8344bfc-3c82-4121-b5b5-ae947e48e6a6 temporary folder 2021-09-27 13:20:44,596 [MainThread ] [INFO ] Removed: timeseries_e8344bfc-3c82-4121-b5b5-ae947e48e6a6 2021-09-27 13:20:44,600 [MainThread ] [INFO ] Creating timeseries_204e16cf-6103-4690-8cfb-a8df944de4a5 temporary folder 2021-09-27 13:20:48,920 [MainThread ] [INFO ] Removed: timeseries_204e16cf-6103-4690-8cfb-a8df944de4a5 2021-09-27 13:20:48,924 [MainThread ] [INFO ] Creating timeseries_a67b16ac-df96-4d56-9a33-a7216e2876b9 temporary folder 2021-09-27 13:20:53,286 [MainThread ] [INFO ] Removed: timeseries_a67b16ac-df96-4d56-9a33-a7216e2876b9 2021-09-27 13:20:53,290 [MainThread ] [INFO ] Creating timeseries_f13446e8-e795-46b8-b98e-3e73fa96f1d1 temporary folder 2021-09-27 13:20:57,769 [MainThread ] [INFO ] Removed: timeseries_f13446e8-e795-46b8-b98e-3e73fa96f1d1
#if Path(timeseries_dir).exists(): shutil.rmtree(timeseries_dir)
#os.mkdir(timeseries_dir)
for timeseries_zipfile in output_timeseries_bp_file_paths.values():
with zipfile.ZipFile(timeseries_zipfile, 'r') as zip_ref:
zip_ref.extractall(timeseries_dir)
from biobb_dna.dna.dna_timeseries import helpartimeseries
output_timeseries_bp_file_paths = {}
for helpar in axis_base_pairs:
input_file_path = "canal_out/canal_output" + "_" + helpar + ".ser"
output_timeseries_bp_file_paths[helpar] = helpar + '.timeseries.zip'
prop = {
'helpar_name': helpar,
'sequence': seq
}
helpartimeseries(
input_ser_path=input_file_path,
output_zip_path=output_timeseries_bp_file_paths[helpar],
properties=prop)
2021-09-27 13:21:04,407 [MainThread ] [INFO ] Creating timeseries_ce516234-186d-424d-af49-702fda21c825 temporary folder 2021-09-27 13:21:08,652 [MainThread ] [INFO ] Removed: timeseries_ce516234-186d-424d-af49-702fda21c825 2021-09-27 13:21:08,656 [MainThread ] [INFO ] Creating timeseries_6dea4243-aab4-4ced-9e52-93eb4e8a41dd temporary folder 2021-09-27 13:21:13,708 [MainThread ] [INFO ] Removed: timeseries_6dea4243-aab4-4ced-9e52-93eb4e8a41dd 2021-09-27 13:21:13,711 [MainThread ] [INFO ] Creating timeseries_bb5fc765-1412-494b-b5ef-c6dfea8fd575 temporary folder 2021-09-27 13:21:17,887 [MainThread ] [INFO ] Removed: timeseries_bb5fc765-1412-494b-b5ef-c6dfea8fd575 2021-09-27 13:21:17,891 [MainThread ] [INFO ] Creating timeseries_660e36f9-0345-4df3-9552-d9f05c6b953f temporary folder 2021-09-27 13:21:22,141 [MainThread ] [INFO ] Removed: timeseries_660e36f9-0345-4df3-9552-d9f05c6b953f
for timeseries_zipfile in output_timeseries_bp_file_paths.values():
with zipfile.ZipFile(timeseries_zipfile, 'r') as zip_ref:
zip_ref.extractall(timeseries_dir)
from biobb_dna.dna.dna_timeseries import helpartimeseries
output_timeseries_bp_file_paths = {}
for helpar in grooves:
input_file_path = "canal_out/canal_output" + "_" + helpar + ".ser"
output_timeseries_bp_file_paths[helpar] = helpar + '.timeseries.zip'
prop = {
'helpar_name': helpar,
'sequence': seq
}
helpartimeseries(
input_ser_path=input_file_path,
output_zip_path=output_timeseries_bp_file_paths[helpar],
properties=prop)
2021-09-27 13:21:27,184 [MainThread ] [INFO ] Creating timeseries_036d33f3-a097-4996-bd40-36fd4b319def temporary folder 2021-09-27 13:21:31,773 [MainThread ] [INFO ] Removed: timeseries_036d33f3-a097-4996-bd40-36fd4b319def 2021-09-27 13:21:31,777 [MainThread ] [INFO ] Creating timeseries_50b7d0ca-1720-45e3-a0b5-678bca35bfdb temporary folder 2021-09-27 13:21:36,100 [MainThread ] [INFO ] Removed: timeseries_50b7d0ca-1720-45e3-a0b5-678bca35bfdb 2021-09-27 13:21:36,105 [MainThread ] [INFO ] Creating timeseries_990144e0-671d-42ae-a9c8-c0f853fd7ac0 temporary folder 2021-09-27 13:21:40,155 [MainThread ] [INFO ] Removed: timeseries_990144e0-671d-42ae-a9c8-c0f853fd7ac0 2021-09-27 13:21:40,159 [MainThread ] [INFO ] Creating timeseries_ede53585-21cf-4009-8833-2ed12ef5977e temporary folder 2021-09-27 13:21:44,201 [MainThread ] [INFO ] Removed: timeseries_ede53585-21cf-4009-8833-2ed12ef5977e
for timeseries_zipfile in output_timeseries_bp_file_paths.values():
with zipfile.ZipFile(timeseries_zipfile, 'r') as zip_ref:
zip_ref.extractall(timeseries_dir)
from biobb_dna.dna.dna_timeseries import helpartimeseries
output_timeseries_bp_file_paths = {}
for helpar in backbone_torsions:
input_file_path = "canal_out/canal_output" + "_" + helpar + ".ser"
output_timeseries_bp_file_paths[helpar] = helpar + '.timeseries.zip'
prop = {
'helpar_name': helpar,
'sequence': seq
}
helpartimeseries(
input_ser_path=input_file_path,
output_zip_path=output_timeseries_bp_file_paths[helpar],
properties=prop)
2021-09-27 13:21:47,099 [MainThread ] [INFO ] Creating timeseries_2a36ce9d-50b6-4918-90bf-113b6d0a5f62 temporary folder 2021-09-27 13:21:52,608 [MainThread ] [INFO ] Removed: timeseries_2a36ce9d-50b6-4918-90bf-113b6d0a5f62 2021-09-27 13:21:52,612 [MainThread ] [INFO ] Creating timeseries_f79d8662-fa82-4bef-aa21-28f13e06a591 temporary folder 2021-09-27 13:21:57,791 [MainThread ] [INFO ] Removed: timeseries_f79d8662-fa82-4bef-aa21-28f13e06a591 2021-09-27 13:21:57,797 [MainThread ] [INFO ] Creating timeseries_bb316d65-f694-4f27-95b5-2a4144b09965 temporary folder 2021-09-27 13:22:02,113 [MainThread ] [INFO ] Removed: timeseries_bb316d65-f694-4f27-95b5-2a4144b09965 2021-09-27 13:22:02,118 [MainThread ] [INFO ] Creating timeseries_d9fe9c10-e81b-43e7-9d1e-b86d5375336e temporary folder 2021-09-27 13:22:06,698 [MainThread ] [INFO ] Removed: timeseries_d9fe9c10-e81b-43e7-9d1e-b86d5375336e 2021-09-27 13:22:06,713 [MainThread ] [INFO ] Creating timeseries_c451d001-cc7b-4af0-8897-8e2b62df1f75 temporary folder 2021-09-27 13:22:12,298 [MainThread ] [INFO ] Removed: timeseries_c451d001-cc7b-4af0-8897-8e2b62df1f75 2021-09-27 13:22:12,302 [MainThread ] [INFO ] Creating timeseries_163ed75c-0b7e-447d-800d-cddd17af0018 temporary folder 2021-09-27 13:22:18,148 [MainThread ] [INFO ] Removed: timeseries_163ed75c-0b7e-447d-800d-cddd17af0018 2021-09-27 13:22:18,152 [MainThread ] [INFO ] Creating timeseries_17f5d496-41e2-4d30-b057-4d2b52d02f44 temporary folder 2021-09-27 13:22:22,778 [MainThread ] [INFO ] Removed: timeseries_17f5d496-41e2-4d30-b057-4d2b52d02f44 2021-09-27 13:22:22,787 [MainThread ] [INFO ] Creating timeseries_282b06ae-def9-4a48-be3f-0d2b591d754a temporary folder 2021-09-27 13:22:27,024 [MainThread ] [INFO ] Removed: timeseries_282b06ae-def9-4a48-be3f-0d2b591d754a 2021-09-27 13:22:27,029 [MainThread ] [INFO ] Creating timeseries_f6902a77-034b-4c4d-aba2-f31643808a11 temporary folder 2021-09-27 13:22:31,933 [MainThread ] [INFO ] Removed: timeseries_f6902a77-034b-4c4d-aba2-f31643808a11 2021-09-27 13:22:31,939 [MainThread ] [INFO ] Creating timeseries_d8a06099-b584-43bc-a5f9-c5246ada7512 temporary folder 2021-09-27 13:22:37,115 [MainThread ] [INFO ] Removed: timeseries_d8a06099-b584-43bc-a5f9-c5246ada7512 2021-09-27 13:22:37,120 [MainThread ] [INFO ] Creating timeseries_8498db01-5f4c-436d-baaa-e3bde771682c temporary folder 2021-09-27 13:22:42,140 [MainThread ] [INFO ] Removed: timeseries_8498db01-5f4c-436d-baaa-e3bde771682c 2021-09-27 13:22:42,145 [MainThread ] [INFO ] Creating timeseries_fd2409f0-3f07-4378-b6b0-bccf8a1554d5 temporary folder 2021-09-27 13:22:48,350 [MainThread ] [INFO ] Removed: timeseries_fd2409f0-3f07-4378-b6b0-bccf8a1554d5 2021-09-27 13:22:48,355 [MainThread ] [INFO ] Creating timeseries_39c85b91-9003-4672-abb4-5e5ecaf2f67f temporary folder 2021-09-27 13:22:52,938 [MainThread ] [INFO ] Removed: timeseries_39c85b91-9003-4672-abb4-5e5ecaf2f67f 2021-09-27 13:22:52,947 [MainThread ] [INFO ] Creating timeseries_63de816c-123a-4d33-8e90-d418766c17a0 temporary folder 2021-09-27 13:22:57,560 [MainThread ] [INFO ] Removed: timeseries_63de816c-123a-4d33-8e90-d418766c17a0 2021-09-27 13:22:57,568 [MainThread ] [INFO ] Creating timeseries_6e3f97cd-0918-4a47-8e30-29c6d6829ac1 temporary folder 2021-09-27 13:23:02,412 [MainThread ] [INFO ] Removed: timeseries_6e3f97cd-0918-4a47-8e30-29c6d6829ac1 2021-09-27 13:23:02,418 [MainThread ] [INFO ] Creating timeseries_6640e832-8fb9-41ce-a50a-7e27bf851971 temporary folder 2021-09-27 13:23:08,251 [MainThread ] [INFO ] Removed: timeseries_6640e832-8fb9-41ce-a50a-7e27bf851971
for timeseries_zipfile in output_timeseries_bp_file_paths.values():
with zipfile.ZipFile(timeseries_zipfile, 'r') as zip_ref:
zip_ref.extractall(timeseries_dir)
Molecular stiffness is an elastic force constant associated with helical deformation at the base pair step level and is determined by inversion of the covariance matrix in helical space, which yields stiffness matrices whose diagonal elements provide the stiffness constants associated with pure rotational (twist, roll and tilt) and translational (rise, shift and slide) deformations within the given step.

Building Blocks used:
from biobb_dna.stiffness.average_stiffness import averagestiffness
helpar = "twist" # Modify this variable to extract time series values for a different helical parameter
input_file_path = "canal_out/canal_output" + "_" + helpar + ".ser"
output_stiffness_csv_path = helpar + '.stiffness.csv'
output_stiffness_jpg_path = helpar + '.stiffness.jpg'
prop = {
'sequence' : seq
}
averagestiffness(
input_ser_path=input_file_path,
output_csv_path=output_stiffness_csv_path,
output_jpg_path=output_stiffness_jpg_path,
properties=prop)
2021-09-27 13:23:12,944 [MainThread ] [INFO ] Creating avgstiffness_6488d66d-3b6e-41b3-b6cb-6fb6cd53a519 temporary folder 2021-09-27 13:23:13,286 [MainThread ] [INFO ] Removed: avgstiffness_6488d66d-3b6e-41b3-b6cb-6fb6cd53a519
0
df = pd.read_csv(output_stiffness_csv_path)
df
| Unnamed: 0 | twist_stiffness | |
|---|---|---|
| 0 | GC | 0.029784 |
| 1 | CG | 0.018300 |
| 2 | GA | 0.028595 |
| 3 | AA | 0.033391 |
| 4 | AT | 0.057062 |
| 5 | TT | 0.031410 |
| 6 | TC | 0.029772 |
| 7 | CG | 0.019181 |
| 8 | GC | 0.031688 |
| 9 | CG | 0.005167 |
Image(filename=output_stiffness_jpg_path,width = 600)
from biobb_dna.stiffness.basepair_stiffness import bpstiffness
timeseries_shift = timeseries_dir+"/series_shift_" + timesel.value + ".csv"
timeseries_slide = timeseries_dir+"/series_slide_" + timesel.value + ".csv"
timeseries_rise = timeseries_dir+"/series_rise_" + timesel.value + ".csv"
timeseries_tilt = timeseries_dir+"/series_tilt_" + timesel.value + ".csv"
timeseries_roll = timeseries_dir+"/series_roll_" + timesel.value + ".csv"
timeseries_twist = timeseries_dir+"/series_twist_" + timesel.value + ".csv"
output_stiffness_bps_csv_path = "stiffness_bps.csv"
output_stiffness_bps_jpg_path = "stiffness_bps.jpg"
bpstiffness(
input_filename_shift=timeseries_shift,
input_filename_slide=timeseries_slide,
input_filename_rise=timeseries_rise,
input_filename_tilt=timeseries_tilt,
input_filename_roll=timeseries_roll,
input_filename_twist=timeseries_twist,
output_csv_path=output_stiffness_bps_csv_path,
output_jpg_path=output_stiffness_bps_jpg_path)
2021-09-27 13:23:16,582 [MainThread ] [INFO ] Creating bp_stiffness_b9f7b874-5201-4c76-adc1-894633c62330 temporary folder 2021-09-27 13:23:16,961 [MainThread ] [INFO ] Removed: bp_stiffness_b9f7b874-5201-4c76-adc1-894633c62330
0
df = pd.read_csv(output_stiffness_bps_csv_path)
df
| 6_TT | shift | slide | rise | tilt | roll | twist | |
|---|---|---|---|---|---|---|---|
| 0 | shift | 1.565629 | -0.750135 | -0.335147 | -0.263984 | -0.153248 | -0.220935 |
| 1 | slide | -0.750135 | 3.535266 | 1.335823 | -0.095199 | -0.287142 | -2.104566 |
| 2 | rise | -0.335147 | 1.335823 | 5.158742 | 0.637038 | 0.415877 | -1.293672 |
| 3 | tilt | -0.024904 | -0.008981 | 0.060098 | 0.235447 | 0.015406 | -0.008404 |
| 4 | roll | -0.014457 | -0.027089 | 0.039234 | 0.015406 | 0.119149 | 0.038313 |
| 5 | twist | -0.020843 | -0.198544 | -0.122045 | -0.008404 | 0.038313 | 0.412179 |
Image(filename=output_stiffness_bps_jpg_path,width = 600)
Base-pair steps helical parameters usually follow a normal (Gaussian-like) distribution. However, recent studies observed bimodal distributions in some base-pair steps for twist and slide, highlighting potential caveats on the harmonic approximation implicit in elastic analysis and mesoscopic models of DNA flexibility.

μABC: a systematic microsecond molecular dynamics study of tetranucleotide sequence effects in B-DNA
Marco Pasi, John H Maddocks, David Beveridge, Thomas C Bishop, David A Case, Thomas Cheatham 3rd, Pablo D Dans, B Jayaram, Filip Lankas, Charles Laughton, Jonathan Mitchell, Roman Osman, Modesto Orozco, Alberto Pérez, Daiva Petkevičiūtė, Nada Spackova, Jiri Sponer, Krystyna Zakrzewska, Richard Lavery
Nucleic Acids Research 2014, Volume 42, Issue 19, Pages 12272-12283
https://doi.org/10.1093/nar/gku855
Exploring polymorphisms in B-DNA helical conformations
Pablo D Dans, Alberto Pérez, Ignacio Faustino, Richard Lavery, Modesto Orozco
Nucleic Acids Research 2012, Volume 40, Issue 21, Pages 10668-10678
https://doi.org/10.1093/nar/gks884
A systematic molecular dynamics study of nearest-neighbor effects on base pair and base pair step conformations and fluctuations in B-DNA
Lavery R, Zakrzewska K, Beveridge D, Bishop TC, Case DA, Cheatham T, III, Dixit S, Jayaram B, Lankas F, Laughton C, John H Maddocks, Alexis Michon, Roman Osman, Modesto Orozco, Alberto Perez, Tanya Singh, Nada Spackova, Jiri Sponer
Nucleic Acids Research 2010, Volume 38, Pages 299–313
https://doi.org/10.1093/nar/gkp834
Building Block used:
from biobb_dna.dna.dna_bimodality import helparbimodality
helpar = "twist"
input_csv = timeseries_dir+"/series_"+helpar+"_"+timesel.value+'.csv'
#input_csv = "/Users/hospital/biobb_tutorials/biobb_dna/timeseries"+"/series_"+timesel.value+'.csv' # <-- TO BE REPLACED BY PREVIOUS LINE
output_bimodality_csv = helpar+'.bimodality.csv'
output_bimodality_jpg = helpar+'.bimodality.jpg'
prop = {
'max_iter': 500
}
helparbimodality(
input_csv_file=input_csv,
output_csv_path=output_bimodality_csv,
output_jpg_path=output_bimodality_jpg,
properties=prop)
2021-09-27 13:23:23,477 [MainThread ] [INFO ] Creating bimodality_0f1f6e57-c258-47a7-8ddc-c1caba7640ce temporary folder 2021-09-27 13:23:24,178 [MainThread ] [INFO ] Removed: bimodality_0f1f6e57-c258-47a7-8ddc-c1caba7640ce
0
file_hist = timeseries_dir + "/hist_" + helpar + "_" + timesel.value + ".jpg"
file_bi = helpar + ".bimodality.jpg"
images = []
images.append(file_hist)
images.append(file_bi)
f, axarr = plt.subplots(1, 2, figsize=(50, 15))
for i, image in enumerate(images):
img = mpimg.imread(image)
axarr[i].imshow(img, aspect='auto')
axarr[i].axis('off')
plt.show()
Sequence-dependent correlation movements have been identified in DNA conformational analysis at the base pair and base pair-step level. Trinucleotides were found to show moderate to high correlations in some intra base pair helical parameter (e.g. shear-opening, shear-stretch, stagger-buckle). Similarly, some tetranucleotides are showing strong correlations in their inter base pair helical parameters (e.g. shift-tilt, slide-twist, rise-tilt, shift-slide, and shift-twist in RR steps), with negative correlations in the shift-slide and roll-twist cases. Correlations are also observed in the combination of inter- and intra-helical parameters (e.g. shift-opening, rise-buckle, stagger-tilt). Correlations analysis can be also extended to neighboring steps (e.g. twist in the central YR step of XYRR tetranucleotides with slide in the adjacent RR step).

The static and dynamic structural heterogeneities of B-DNA: extending Calladine–Dickerson rules
Pablo D Dans, Alexandra Balaceanu, Marco Pasi, Alessandro S Patelli, Daiva Petkevičiūtė, Jürgen Walther, Adam Hospital, Genís Bayarri, Richard Lavery, John H Maddocks, Modesto Orozco
Nucleic Acids Research 2019, Volume 47, Issue 21, Pages 11090-11102
https://doi.org/10.1093/nar/gkz905
Building Blocks used:
from biobb_dna.intrabp_correlations.intraseqcorr import intrasequencecorrelation
input_file_path = "canal_out/canal_output" + "_" + helpar + ".ser"
output_intrabp_correlation_csv_path = helpar+'.intrabp_correlation.csv'
output_intrabp_correlation_jpg_path = helpar+'.intrabp_correlation.jpg'
prop={
'sequence' : seq,
# 'helpar_name' : 'Rise'
}
intrasequencecorrelation(
input_ser_path=input_file_path,
output_csv_path=output_intrabp_correlation_csv_path,
output_jpg_path=output_intrabp_correlation_jpg_path,
properties=prop)
2021-09-27 13:23:32,758 [MainThread ] [INFO ] Creating bpcorrelation_b687484f-f67d-4d85-a298-da6557a8b2ef temporary folder 2021-09-27 13:23:33,212 [MainThread ] [INFO ] Removed: bpcorrelation_b687484f-f67d-4d85-a298-da6557a8b2ef
0
df = pd.read_csv(output_intrabp_correlation_csv_path)
df
| Unnamed: 0 | G | C | G_dup | A | A_dup | T | T_dup | C_dup | G_dup_dup | C_dup_dup | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | G | 1.000000 | -0.358404 | 0.068163 | 0.009405 | 0.001087 | 0.019698 | -0.015957 | 0.007849 | 0.031577 | 0.010621 |
| 1 | C | -0.358404 | 1.000000 | -0.448247 | -0.007771 | 0.026130 | -0.000455 | -0.005333 | -0.013427 | 0.001874 | 0.025165 |
| 2 | G_dup | 0.068163 | -0.448247 | 1.000000 | -0.335686 | -0.048327 | -0.029954 | 0.018416 | 0.013668 | -0.026111 | 0.016239 |
| 3 | A | 0.009405 | -0.007771 | -0.335686 | 1.000000 | -0.293111 | 0.070316 | -0.003760 | 0.006406 | -0.000036 | 0.007260 |
| 4 | A_dup | 0.001087 | 0.026130 | -0.048327 | -0.293111 | 1.000000 | -0.273114 | -0.056474 | 0.001153 | -0.002821 | -0.016981 |
| 5 | T | 0.019698 | -0.000455 | -0.029954 | 0.070316 | -0.273114 | 1.000000 | -0.335193 | -0.006889 | -0.001462 | 0.017912 |
| 6 | T_dup | -0.015957 | -0.005333 | 0.018416 | -0.003760 | -0.056474 | -0.335193 | 1.000000 | -0.447663 | 0.059009 | 0.019932 |
| 7 | C_dup | 0.007849 | -0.013427 | 0.013668 | 0.006406 | 0.001153 | -0.006889 | -0.447663 | 1.000000 | -0.372677 | 0.014933 |
| 8 | G_dup_dup | 0.031577 | 0.001874 | -0.026111 | -0.000036 | -0.002821 | -0.001462 | 0.059009 | -0.372677 | 1.000000 | -0.236134 |
| 9 | C_dup_dup | 0.010621 | 0.025165 | 0.016239 | 0.007260 | -0.016981 | 0.017912 | 0.019932 | 0.014933 | -0.236134 | 1.000000 |
Image(filename=output_intrabp_correlation_jpg_path,width = 600)
from biobb_dna.interbp_correlations.interseqcorr import intersequencecorrelation
input_file_path = "canal_out/canal_output" + "_" + helpar + ".ser"
output_interbp_correlation_csv_path = helpar+'.interbp_correlation.csv'
output_interbp_correlation_jpg_path = helpar+'.interbp_correlation.jpg'
prop={
'sequence' : seq,
# 'helpar_name' : 'Rise'
}
intersequencecorrelation(
input_ser_path=input_file_path,
output_csv_path=output_interbp_correlation_csv_path,
output_jpg_path=output_interbp_correlation_jpg_path,
properties=prop)
2021-09-27 13:23:44,750 [MainThread ] [INFO ] Creating bpcorrelation_c2da8d8b-c3b9-4966-b1c4-eae680e38e2d temporary folder 2021-09-27 13:23:45,222 [MainThread ] [INFO ] Removed: bpcorrelation_c2da8d8b-c3b9-4966-b1c4-eae680e38e2d
0
df = pd.read_csv(output_interbp_correlation_csv_path)
df
| Unnamed: 0 | GC | CG | GA | AA | AT | TT | TC | CG_dup | GC_dup | CG_dup_dup | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | GC | 1.000000 | -0.358404 | 0.068163 | 0.009405 | 0.001087 | 0.019698 | -0.015957 | 0.007849 | 0.031577 | 0.010621 |
| 1 | CG | -0.358404 | 1.000000 | -0.448247 | -0.007771 | 0.026130 | -0.000455 | -0.005333 | -0.013427 | 0.001874 | 0.025165 |
| 2 | GA | 0.068163 | -0.448247 | 1.000000 | -0.335686 | -0.048327 | -0.029954 | 0.018416 | 0.013668 | -0.026111 | 0.016239 |
| 3 | AA | 0.009405 | -0.007771 | -0.335686 | 1.000000 | -0.293111 | 0.070316 | -0.003760 | 0.006406 | -0.000036 | 0.007260 |
| 4 | AT | 0.001087 | 0.026130 | -0.048327 | -0.293111 | 1.000000 | -0.273114 | -0.056474 | 0.001153 | -0.002821 | -0.016981 |
| 5 | TT | 0.019698 | -0.000455 | -0.029954 | 0.070316 | -0.273114 | 1.000000 | -0.335193 | -0.006889 | -0.001462 | 0.017912 |
| 6 | TC | -0.015957 | -0.005333 | 0.018416 | -0.003760 | -0.056474 | -0.335193 | 1.000000 | -0.447663 | 0.059009 | 0.019932 |
| 7 | CG_dup | 0.007849 | -0.013427 | 0.013668 | 0.006406 | 0.001153 | -0.006889 | -0.447663 | 1.000000 | -0.372677 | 0.014933 |
| 8 | GC_dup | 0.031577 | 0.001874 | -0.026111 | -0.000036 | -0.002821 | -0.001462 | 0.059009 | -0.372677 | 1.000000 | -0.236134 |
| 9 | CG_dup_dup | 0.010621 | 0.025165 | 0.016239 | 0.007260 | -0.016981 | 0.017912 | 0.019932 | 0.014933 | -0.236134 | 1.000000 |
Image(filename=output_interbp_correlation_jpg_path,width = 600)
from biobb_dna.intrabp_correlations.intrahpcorr import intrahelparcorrelation
timeseries_shear = timeseries_dir+"/series_shear_"+timesel.value[:-1]+".csv"
timeseries_stretch = timeseries_dir+"/series_stretch_"+timesel.value[:-1]+".csv"
timeseries_stagger = timeseries_dir+"/series_stagger_"+timesel.value[:-1]+".csv"
timeseries_buckle = timeseries_dir+"/series_buckle_"+timesel.value[:-1]+".csv"
timeseries_propel = timeseries_dir+"/series_propel_"+timesel.value[:-1]+".csv"
timeseries_opening = timeseries_dir+"/series_opening_"+timesel.value[:-1]+".csv"
output_helpar_bp_correlation_csv_path = "helpar_bp_correlation.csv"
output_helpar_bp_correlation_jpg_path = "helpar_bp_correlation.jpg"
intrahelparcorrelation(
input_filename_shear=timeseries_shear,
input_filename_stretch=timeseries_stretch,
input_filename_stagger=timeseries_stagger,
input_filename_buckle=timeseries_buckle,
input_filename_propel=timeseries_propel,
input_filename_opening=timeseries_opening,
output_csv_path=output_helpar_bp_correlation_csv_path,
output_jpg_path=output_helpar_bp_correlation_jpg_path)
2021-09-27 13:23:52,341 [MainThread ] [INFO ] Creating hpcorrelation_e7515014-5913-476d-b4fc-1cef3f3dfe64 temporary folder 2021-09-27 13:23:52,714 [MainThread ] [INFO ] Removed: hpcorrelation_e7515014-5913-476d-b4fc-1cef3f3dfe64
0
df = pd.read_csv(output_helpar_bp_correlation_csv_path)
df
| Unnamed: 0 | shear | stretch | stagger | buckle | propel | opening | |
|---|---|---|---|---|---|---|---|
| 0 | shear | 1.000000 | -0.122310 | 0.062783 | -0.024045 | 0.012255 | 0.066297 |
| 1 | stretch | -0.122310 | 1.000000 | -0.024948 | -0.141840 | -0.034225 | 0.396172 |
| 2 | stagger | 0.062783 | -0.024948 | 1.000000 | -0.172773 | -0.253086 | 0.138571 |
| 3 | buckle | -0.024045 | -0.141840 | -0.172773 | 1.000000 | 0.029668 | 0.000972 |
| 4 | propel | 0.012255 | -0.034225 | -0.253086 | 0.029668 | 1.000000 | -0.138484 |
| 5 | opening | 0.066297 | 0.396172 | 0.138571 | 0.000972 | -0.138484 | 1.000000 |
Image(filename=output_helpar_bp_correlation_jpg_path,width = 600)
from biobb_dna.interbp_correlations.interhpcorr import interhelparcorrelation
timeseries_shift = timeseries_dir+"/series_shift_"+timesel.value+".csv"
timeseries_slide = timeseries_dir+"/series_slide_"+timesel.value+".csv"
timeseries_rise = timeseries_dir+"/series_rise_"+timesel.value+".csv"
timeseries_tilt = timeseries_dir+"/series_tilt_"+timesel.value+".csv"
timeseries_roll = timeseries_dir+"/series_roll_"+timesel.value+".csv"
timeseries_twist = timeseries_dir+"/series_twist_"+timesel.value+".csv"
output_helpar_bps_correlation_csv_path = "helpar_bps_correlation.csv"
output_helpar_bps_correlation_jpg_path = "helpar_bps_correlation.jpg"
interhelparcorrelation(
input_filename_shift=timeseries_shift,
input_filename_slide=timeseries_slide,
input_filename_rise=timeseries_rise,
input_filename_tilt=timeseries_tilt,
input_filename_roll=timeseries_roll,
input_filename_twist=timeseries_twist,
output_csv_path=output_helpar_bps_correlation_csv_path,
output_jpg_path=output_helpar_bps_correlation_jpg_path)
2021-09-27 13:24:00,302 [MainThread ] [INFO ] Creating hpcorrelation_0ca27714-2e4d-4f01-8fa8-fefc89cac7fb temporary folder 2021-09-27 13:24:00,629 [MainThread ] [INFO ] Removed: hpcorrelation_0ca27714-2e4d-4f01-8fa8-fefc89cac7fb
0
df = pd.read_csv(output_helpar_bps_correlation_csv_path)
df
| Unnamed: 0 | shift | slide | rise | tilt | roll | twist | |
|---|---|---|---|---|---|---|---|
| 0 | shift | 1.000000 | 0.439831 | 0.006659 | 0.166307 | 0.113906 | 0.318624 |
| 1 | slide | 0.439831 | 1.000000 | -0.207299 | 0.151236 | 0.141627 | 0.541202 |
| 2 | rise | 0.006659 | -0.207299 | 1.000000 | -0.173336 | -0.224263 | 0.147534 |
| 3 | tilt | 0.166307 | 0.151236 | -0.173336 | 1.000000 | -0.032611 | 0.084610 |
| 4 | roll | 0.113906 | 0.141627 | -0.224263 | -0.032611 | 1.000000 | -0.109835 |
| 5 | twist | 0.318624 | 0.541202 | 0.147534 | 0.084610 | -0.109835 | 1.000000 |
Image(filename=output_helpar_bps_correlation_jpg_path,width = 600)
from biobb_dna.intrabp_correlations.intrabpcorr import intrabasepaircorrelation
canal_shear = canal_dir+"/canal_output_shear.ser"
canal_stretch = canal_dir+"/canal_output_stretch.ser"
canal_stagger = canal_dir+"/canal_output_stagger.ser"
canal_buckle = canal_dir+"/canal_output_buckle.ser"
canal_propel = canal_dir+"/canal_output_propel.ser"
canal_opening = canal_dir+"/canal_output_opening.ser"
output_bp_correlation_csv_path = "bp_correlation.csv"
output_bp_correlation_jpg_path = "bp_correlation.jpg"
prop = {
'sequence' : seq
}
intrabasepaircorrelation(
input_filename_shear=canal_shear,
input_filename_stretch=canal_stretch,
input_filename_stagger=canal_stagger,
input_filename_buckle=canal_buckle,
input_filename_propel=canal_propel,
input_filename_opening=canal_opening,
output_csv_path=output_bp_correlation_csv_path,
output_jpg_path=output_bp_correlation_jpg_path,
properties=prop)
2021-09-27 13:24:04,894 [MainThread ] [INFO ] Creating bpcorrelation_7c39f5db-b619-4a37-973e-c7ec5a969460 temporary folder
/anaconda3/envs/biobb_dna_helparms_tutorial/lib/python3.7/site-packages/biobb_dna/intrabp_correlations/intrabpcorr.py:225: MatplotlibDeprecationWarning: You are modifying the state of a globally registered colormap. This has been deprecated since 3.3 and in 3.6, you will not be able to modify a registered colormap in-place. To remove this warning, you can make a copy of the colormap first. cmap = mpl.cm.get_cmap("bwr").copy()
cmap.set_bad(color="gainsboro")
2021-09-27 13:24:06,396 [MainThread ] [INFO ] Removed: bpcorrelation_7c39f5db-b619-4a37-973e-c7ec5a969460
0
df = pd.read_csv(output_bp_correlation_csv_path)
df
| Unnamed: 0 | shear/shear | shear/stretch | shear/stagger | shear/buckle | shear/propel | shear/opening | stretch/shear | stretch/stretch | stretch/stagger | ... | propel/stagger | propel/buckle | propel/propel | propel/opening | opening/shear | opening/stretch | opening/stagger | opening/buckle | opening/propel | opening/opening | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | GC | -0.005642 | 0.018630 | -0.025087 | -0.010539 | 0.024243 | 0.019345 | -0.011061 | 0.021519 | -0.020824 | ... | -0.024222 | 0.005585 | 0.006706 | -0.021960 | -0.010185 | 0.006839 | -0.011742 | 0.001443 | 0.005266 | 0.019129 |
| 1 | CG | 0.020041 | 0.051434 | 0.001951 | -0.059043 | 0.027216 | 0.045336 | -0.051213 | -0.033895 | 0.009731 | ... | 0.047045 | -0.335039 | -0.122872 | -0.067494 | -0.047289 | -0.037659 | 0.023410 | -0.105277 | -0.082899 | 0.005289 |
| 2 | GA | -0.037801 | 0.027801 | -0.015043 | -0.105601 | 0.073992 | 0.025790 | -0.010322 | 0.043308 | 0.011708 | ... | 0.073619 | -0.176741 | -0.051962 | 0.003138 | 0.029520 | 0.005160 | 0.051622 | 0.007026 | -0.017732 | 0.021906 |
| 3 | AA | 0.023788 | -0.000117 | -0.004595 | -0.027747 | -0.031901 | -0.041804 | 0.023817 | -0.006556 | -0.005822 | ... | -0.061579 | -0.147203 | -0.004269 | -0.050649 | -0.039697 | 0.035455 | -0.018540 | -0.093678 | -0.048917 | 0.042879 |
| 4 | AT | 0.032575 | 0.003270 | -0.028156 | -0.070385 | -0.066959 | 0.021602 | 0.014530 | 0.014772 | -0.047132 | ... | -0.113304 | -0.173226 | 0.107649 | 0.019869 | -0.036929 | -0.020634 | -0.030510 | 0.021788 | 0.036952 | 0.119437 |
| 5 | TT | -0.010285 | -0.009504 | 0.035414 | 0.004137 | 0.003110 | 0.052842 | -0.020711 | 0.005863 | -0.007229 | ... | -0.091332 | -0.132041 | 0.183310 | 0.023056 | -0.051025 | 0.023035 | -0.017409 | 0.003068 | 0.010977 | 0.191239 |
| 6 | TC | 0.002390 | 0.003433 | -0.024093 | 0.014480 | -0.012981 | 0.059420 | 0.030606 | 0.011428 | 0.005235 | ... | -0.019650 | -0.152708 | 0.080905 | 0.021846 | -0.012044 | 0.067861 | -0.083172 | 0.048029 | 0.028434 | 0.151051 |
| 7 | CG | -0.027842 | 0.002501 | 0.014905 | 0.010619 | 0.018963 | 0.030263 | -0.020690 | 0.016758 | 0.033150 | ... | 0.063238 | -0.222382 | 0.012536 | -0.021479 | -0.042303 | 0.020077 | 0.014984 | -0.018558 | -0.076957 | 0.006822 |
| 8 | GC | -0.030918 | 0.025377 | 0.047997 | -0.149693 | 0.032799 | 0.004369 | -0.029652 | 0.018781 | 0.037963 | ... | 0.063868 | -0.188492 | -0.044925 | -0.027264 | -0.025809 | 0.027768 | 0.030092 | -0.062296 | 0.025229 | 0.019334 |
| 9 | CG | -0.001727 | 0.039837 | 0.013472 | -0.031293 | -0.014562 | 0.043469 | -0.018693 | -0.008573 | 0.026310 | ... | 0.044891 | -0.303367 | -0.113362 | -0.080200 | -0.018591 | -0.029267 | 0.032287 | -0.095808 | -0.074735 | -0.005167 |
10 rows × 37 columns
Image(filename=output_bp_correlation_jpg_path,width = 800)
from biobb_dna.interbp_correlations.interbpcorr import interbasepaircorrelation
canal_shift = canal_dir+"/canal_output_shift.ser"
canal_slide = canal_dir+"/canal_output_slide.ser"
canal_rise = canal_dir+"/canal_output_rise.ser"
canal_tilt = canal_dir+"/canal_output_tilt.ser"
canal_roll = canal_dir+"/canal_output_roll.ser"
canal_twist = canal_dir+"/canal_output_twist.ser"
output_bps_correlation_csv_path = "bps_correlation.csv"
output_bps_correlation_jpg_path = "bps_correlation.jpg"
prop = {
'sequence' : seq
}
interbasepaircorrelation(
input_filename_shift=canal_shift,
input_filename_slide=canal_slide,
input_filename_rise=canal_rise,
input_filename_tilt=canal_tilt,
input_filename_roll=canal_roll,
input_filename_twist=canal_twist,
output_csv_path=output_bps_correlation_csv_path,
output_jpg_path=output_bps_correlation_jpg_path,
properties=prop)
2021-09-27 13:24:12,743 [MainThread ] [INFO ] Creating bpcorrelation_ef1ffc32-4ee3-4a3c-91ec-ade48228675a temporary folder
/anaconda3/envs/biobb_dna_helparms_tutorial/lib/python3.7/site-packages/biobb_dna/interbp_correlations/interbpcorr.py:225: MatplotlibDeprecationWarning: You are modifying the state of a globally registered colormap. This has been deprecated since 3.3 and in 3.6, you will not be able to modify a registered colormap in-place. To remove this warning, you can make a copy of the colormap first. cmap = mpl.cm.get_cmap("bwr").copy()
cmap.set_bad(color="gainsboro")
2021-09-27 13:24:13,888 [MainThread ] [INFO ] Removed: bpcorrelation_ef1ffc32-4ee3-4a3c-91ec-ade48228675a
0
df = pd.read_csv(output_bps_correlation_csv_path)
df
| Unnamed: 0 | shift/shift | shift/slide | shift/rise | shift/tilt | shift/roll | shift/twist | slide/shift | slide/slide | slide/rise | ... | roll/rise | roll/tilt | roll/roll | roll/twist | twist/shift | twist/slide | twist/rise | twist/tilt | twist/roll | twist/twist | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | GCG | -0.034515 | -0.013978 | 0.028504 | -0.016774 | 0.007526 | -0.019706 | -0.005945 | 0.003258 | 0.016998 | ... | -0.024289 | 0.034074 | -0.014922 | 0.002918 | 0.024087 | 0.011902 | -0.018600 | -0.000658 | 0.004749 | 0.031633 |
| 1 | CGA | -0.620729 | -0.213534 | 0.153552 | -0.379811 | -0.069482 | 0.090324 | -0.185796 | 0.035204 | -0.072609 | ... | 0.185182 | 0.239957 | -0.327237 | 0.146238 | -0.301986 | -0.066687 | -0.230273 | -0.149935 | 0.234940 | -0.358276 |
| 2 | GAA | -0.556258 | -0.055634 | 0.106571 | -0.307558 | -0.124963 | 0.111320 | 0.336513 | 0.207062 | 0.010327 | ... | 0.243220 | 0.222142 | -0.356105 | 0.262547 | -0.112359 | -0.169531 | -0.267369 | -0.017553 | 0.222109 | -0.448559 |
| 3 | AAT | -0.422776 | 0.121578 | 0.172917 | -0.120018 | -0.234489 | 0.333776 | 0.200294 | 0.055308 | -0.062561 | ... | 0.182963 | 0.269344 | -0.261031 | 0.037170 | -0.055827 | -0.035366 | -0.125160 | -0.008524 | 0.080896 | -0.336058 |
| 4 | ATT | -0.135159 | 0.048764 | 0.052558 | 0.152407 | -0.137398 | 0.093877 | -0.058611 | 0.177430 | 0.140074 | ... | 0.189502 | 0.241552 | -0.262516 | 0.016017 | -0.027548 | -0.024130 | -0.059852 | -0.003261 | 0.031997 | -0.293106 |
| 5 | TTC | -0.115228 | 0.074008 | 0.011805 | 0.204128 | -0.216200 | -0.032101 | -0.028032 | 0.184504 | 0.114301 | ... | 0.113645 | 0.247625 | -0.299766 | -0.015864 | -0.148998 | 0.111239 | 0.075143 | 0.053240 | -0.015885 | -0.273020 |
| 6 | TCG | -0.428290 | -0.230813 | 0.116441 | 0.057122 | -0.208447 | -0.118440 | -0.121029 | 0.055468 | 0.090725 | ... | 0.093449 | 0.287541 | -0.273905 | 0.101453 | -0.316904 | -0.253028 | -0.076103 | -0.114819 | 0.066582 | -0.335519 |
| 7 | CGC | -0.577276 | -0.389311 | 0.286179 | -0.324285 | -0.179625 | 0.071846 | 0.054698 | 0.233093 | -0.148480 | ... | 0.140274 | 0.210203 | -0.362062 | 0.244422 | -0.150061 | -0.097582 | -0.265051 | -0.021533 | 0.240612 | -0.447802 |
| 8 | GCG | -0.626682 | 0.229957 | 0.290226 | -0.338901 | -0.240202 | 0.328321 | 0.213949 | 0.040634 | 0.007077 | ... | 0.287660 | 0.196857 | -0.327687 | 0.263437 | -0.127565 | -0.149207 | -0.200700 | -0.092674 | 0.171328 | -0.372676 |
9 rows × 37 columns
Image(filename=output_bps_correlation_jpg_path,width = 800)
Questions, issues, suggestions and comments are really welcome!
GitHub issues:
BioExcel forum: